When considering the consistency of a binary rating (like yes or no) for the same
item between two raters, you can estimate inter-rater reliability by having each
rater rate the same group of items. Imagine we had two raters rate the same 50 scans as yes or no according to whether each scan showed a tumor. We cross-tabulated the results and present them in Figure 13-6.
Looking at Figure 13-6, cell a contains a count of how many scans were rated
yes — there is a tumor — by both Rater 1 and Rater 2. Cell b counts how many
scans were rated yes by Rater 1 but no by Rater 2. Cell c counts how many scans
were rated no by Rater 1 and yes by Rater 2, and cell d shows where Rater 1 and
Rater 2 agreed and both rated the scan no. Cells a and d are considered concordant because the raters agreed, and cells b and c are discordant because the raters disagreed.
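If you keep each rater's calls in two parallel lists, a few lines of code can tally the four cells. Here's a minimal Python sketch; the fourfold_counts helper and the sample rating lists are hypothetical illustrations, not the actual scan data behind Figure 13-6.

```python
def fourfold_counts(rater1, rater2):
    """Tally cells a, b, c, d of the fourfold agreement table."""
    a = sum(1 for x, y in zip(rater1, rater2) if x == "yes" and y == "yes")
    b = sum(1 for x, y in zip(rater1, rater2) if x == "yes" and y == "no")
    c = sum(1 for x, y in zip(rater1, rater2) if x == "no" and y == "yes")
    d = sum(1 for x, y in zip(rater1, rater2) if x == "no" and y == "no")
    return a, b, c, d

# Hypothetical ratings for five scans (not the Figure 13-6 data)
rater1 = ["yes", "yes", "no", "no", "yes"]
rater2 = ["yes", "no", "no", "yes", "yes"]
print(fourfold_counts(rater1, rater2))  # (2, 1, 1, 1)
```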
Ideally, all the scans would be counted in concordant cells a or d of Figure 13-6,
and discordant cells b and c would contain zeros. A measure of how close the data
come to this ideal is called Cohen’s Kappa, and is signified by the Greek lowercase
kappa: κ. You calculate kappa as:
\[
\kappa = \frac{2(ad - bc)}{r_1 c_2 + r_2 c_1}
\]

where $r_1$ and $r_2$ are the row totals and $c_1$ and $c_2$ are the column totals of the fourfold table.
For the data in Figure 13-6,

\[
\kappa = \frac{2\left[(22)(16) - (5)(7)\right]}{(27)(21) + (23)(29)} = \frac{634}{1234} \approx 0.5138
\]
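As a quick arithmetic check, the following Python sketch computes κ both with the shortcut formula above and from the general definition κ = (p_o − p_e)/(1 − p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance. The function name cohens_kappa is my own; the cell counts come from the worked example.

```python
def cohens_kappa(a, b, c, d):
    """Cohen's kappa for a fourfold table, via the shortcut formula."""
    r1, r2 = a + b, c + d        # row totals for Rater 1
    c1, c2 = a + c, b + d        # column totals for Rater 2
    return 2 * (a * d - b * c) / (r1 * c2 + r2 * c1)

# Cell counts from Figure 13-6: a = 22, b = 5, c = 7, d = 16
print(round(cohens_kappa(22, 5, 7, 16), 4))  # 0.5138

# Cross-check against the general definition (p_o - p_e) / (1 - p_e)
a, b, c, d = 22, 5, 7, 16
n = a + b + c + d                                      # 50 scans in all
p_o = (a + d) / n                                      # observed agreement
p_e = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2   # chance agreement
print(round((p_o - p_e) / (1 - p_e), 4))               # 0.5138
```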
How is this interpreted? If the raters are in perfect agreement, then κ = 1. If the ratings are completely random, κ comes out close to 0. You may think this means κ takes on a positive value between 0 and 1, but random sampling fluctuations can actually cause κ to be negative. This situation can be compared to a student taking a true/false test where the number of wrong answers is subtracted from the number of right answers as a penalty for guessing. When calculating κ, getting a score less than zero indicates the interesting combination of being both incorrect and unfortunate, and is penalized!
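To see that negative values really do occur, the sketch below simulates two raters making 50 coin-flip ratings apiece; across many simulated tables, κ averages out near 0, but individual tables land on either side of it. This simulation is just an illustration under assumed random ratings, not an analysis from the chapter.

```python
import random

random.seed(1)  # for reproducibility

def cohens_kappa(a, b, c, d):
    """Shortcut formula: kappa = 2(ad - bc) / (r1*c2 + r2*c1)."""
    return 2 * (a * d - b * c) / ((a + b) * (b + d) + (c + d) * (a + c))

kappas = []
for _ in range(1000):
    rater1 = [random.choice(["yes", "no"]) for _ in range(50)]
    rater2 = [random.choice(["yes", "no"]) for _ in range(50)]
    a = sum(x == "yes" and y == "yes" for x, y in zip(rater1, rater2))
    b = sum(x == "yes" and y == "no" for x, y in zip(rater1, rater2))
    c = sum(x == "no" and y == "yes" for x, y in zip(rater1, rater2))
    d = sum(x == "no" and y == "no" for x, y in zip(rater1, rater2))
    kappas.append(cohens_kappa(a, b, c, d))

print(round(sum(kappas) / len(kappas), 3))  # averages near 0
print(min(kappas))                          # some tables come out negative
```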
FIGURE 13-6: Results of two raters reading the same set of 50 specimens and rating each specimen yes or no. © John Wiley & Sons, Inc.